
Add overlay[local] annotations#21116

Open
tausbn wants to merge 13 commits into main from tausbn/python-add-dataflow-overlay-annotations


@tausbn tausbn commented Jan 7, 2026

Makes the CodeQL Python analysis overlay-aware, up to and including DataFlow::Node.

The rough rule of thumb in these changes is:

  • AST, CFG, SSA and DataFlow::Node are all local,
  • but everything that involves the call graph is (for the moment) considered to be inherently global, and annotated accordingly.
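
As a rough illustration of that rule of thumb, a library file might be annotated along these lines. This is only a sketch: the predicate names are hypothetical and not taken from this PR, the body of the global predicate is a syntactic stand-in for real call-graph logic, and it assumes the pack is set up for overlay evaluation.

```ql
/**
 * Hypothetical library file, for illustration only.
 */
overlay[local]
module;

import python

/** Local: depends only on the AST, so it inherits `overlay[local]` from the module. */
predicate nameUse(Name n, string id) { n.getId() = id }

/** Inlined helpers follow the locality of whoever calls them. */
overlay[caller]
pragma[inline]
predicate sameScope(AstNode a, AstNode b) { a.getScope() = b.getScope() }

/**
 * Global: stands in for something that would consult the call graph.
 * The body here is purely syntactic and only marks where such logic would go.
 */
overlay[global]
predicate callsByName(Call c, string id) { c.getFunc().(Name).getId() = id }
```

The intent is that the module-level annotation sets the default, and only the predicates that are inherently whole-program opt out with `overlay[global]`.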

Finally, after adding all of the overlay annotations, a few join-order fixes were needed in order to get performance back to normal. These have been added at the end of this PR.

@github-actions github-actions bot added the Python label Jan 7, 2026
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from e203d8b to 97e2376 Compare January 9, 2026 16:27
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from 97e2376 to 9e43da9 Compare January 30, 2026 13:52
... and everything else that it depends on.
None of these required any changes to the dataflow libraries, so it
seemed easiest to put them in their own commit.
These were causing the repo `gufolabs/noc` to spend ~30 seconds
evaluating `ControlFlowNode.strictlyDominates`. Just in case, I added
`overlay[caller]` to the other instances of `pragma[inline]` as well.
On `keras-team/keras`, this was producing ~200 million intermediate
tuples in order to produce a total of ... 2 tuples.

After the refactor, max intermediate tuple count is ~80k for the
charpred (and 4 for the new helper predicate).
This caused a ~30x blowup in intermediate tuples, now back to baseline.
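
For readers unfamiliar with this kind of join-order fix, the general shape is to pull the selective part of a characteristic predicate into a small `pragma[nomagic]` helper. The sketch below is not the code from these commits; the class and helper names are hypothetical.

```ql
import python

/**
 * Hypothetical helper: computed once as its own small relation,
 * rather than being folded into the join of whatever uses it.
 */
pragma[nomagic]
private predicate functionNamed(Function f, string name) { f.getName() = name }

/** Hypothetical class whose charpred now starts from the small helper relation. */
class MainFunction extends Function {
  MainFunction() { functionNamed(this, "main") }
}
```

Whether a given helper actually improves the join order has to be confirmed from the evaluator's tuple counts, as was done for the commits above.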
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from c949417 to 304cd12 Compare February 16, 2026 13:48
@tausbn tausbn force-pushed the tausbn/python-add-dataflow-overlay-annotations branch from 0ca3f60 to cd62cda Compare February 16, 2026 22:27
@tausbn tausbn marked this pull request as ready for review February 18, 2026 12:52
@tausbn tausbn requested a review from a team as a code owner February 18, 2026 12:52
Copilot AI review requested due to automatic review settings February 18, 2026 12:52

Copilot AI left a comment


Pull request overview

This PR makes the CodeQL Python analysis overlay-aware by adding overlay annotations throughout the Python QL libraries. The changes enable overlay evaluation, which should significantly improve performance when a base database already exists. According to the PR description, AST, CFG, SSA, and DataFlow::Node are marked as local, while call graph-related functionality remains global.

Changes:

  • Added overlay[local], overlay[local?], overlay[global], and overlay[caller] annotations to Python QL library modules, classes, and predicates
  • Introduced join-order optimization helper predicates with pragma[nomagic] to maintain performance after overlay annotations
  • Added final class aliases and converted to instanceof patterns to enable extending overlay[local] classes from non-overlay contexts
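
The final-alias and `instanceof` pattern from the last bullet can be sketched as follows. The alias name mirrors the `FinalAstNode` mentioned in the file summary, but the class itself is hypothetical and is not the code changed in this PR.

```ql
import python

/** Final alias: subclasses can call `AstNode` members but cannot override them. */
private final class FinalAstNode = AstNode;

/**
 * Hypothetical class. It extends the alias rather than `AstNode` itself,
 * and uses `instanceof` to restrict its values to functions.
 */
class NamedFunctionNode extends FinalAstNode instanceof Function {
  /** Gets a short description, reaching `Function` members through `super`. */
  string describe() { result = "function " + super.getName() }
}
```

Because nothing is overridden, the new class does not feed back into the dispatch of the `overlay[local]` base class, which is presumably why the `override` keyword had to go in `ImportFailure.ql`.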

Reviewed changes

Copilot reviewed 55 out of 55 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `python/ql/lib/change-notes/2026-02-18-add-overlay-annotations.md` | Documents the overlay evaluation compatibility changes |
| `python/ql/lib/semmle/python/**/*.qll` | Adds overlay[local] module annotations to AST, CFG, SSA, and dataflow libraries |
| `python/ql/lib/semmle/python/dataflow/new/**/*.qll` | Adds overlay[local] to dataflow components and overlay[global] to call graph-related predicates |
| `python/ql/lib/semmle/python/frameworks/*.qll` | Adds overlay[local?] module annotations and join-order helpers to framework models |
| `python/ql/lib/semmle/python/objects/TObject.qll` | Adds join-order helper predicate for missing_imported_module |
| `python/ql/lib/semmle/python/internal/CachedStages.qll` | Adds overlay[local] to AST and DataFlow stage predicates |
| `python/ql/src/analysis/ImportFailure.ql` | Introduces FinalControlFlowNode alias and removes override keyword |
| `python/ql/src/Variables/LoopVariableCapture/LoopVariableCaptureQuery.qll` | Introduces FinalAstNode alias and converts to instanceof pattern |
| `python/ql/lib/analysis/DefinitionTracking.qll` | Introduces FinalExpr alias for NiceLocationExpr |
| `python/ql/test/library-tests/**/*.qll` | Adds overlay annotations to test helper modules |


@asgerf asgerf left a comment


Awesome! 🎉
